robust loss function
Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels
Paquin, Alexandre Lemire, Chaib-Draa, Brahim, Giguère, Philippe
Labeling a training set is often expensive and susceptible to errors, making the design of robust loss functions for label noise an important problem. The symmetry condition provides theoretical guarantees for robustness to such noise. In this work, we study a symmetrization method arising from the unique decomposition of any multi-class loss function into a symmetric component and a class-insensitive term. In particular, symmetrizing the cross-entropy loss leads to a linear multi-class extension of the unhinged loss. Unlike in the binary case, the multi-class version must have specific coefficients in order to satisfy the symmetry condition. Under suitable assumptions, we show that this multi-class unhinged loss is the unique convex multi-class symmetric loss. We also show that it has a fundamental local role: the linear approximation of any symmetric loss around score vectors with equal components is equivalent to the multi-class unhinged loss. We then introduce SGCE and alpha-MAE, two loss functions that interpolate between the multi-class unhinged loss and the Mean Absolute Error while allowing control of the beta-smoothness of the loss. Experiments on standard noisy-label benchmarks show competitive performance compared with existing robust loss functions.
Adaptive Factor Graph-Based Tightly Coupled GNSS/IMU Fusion for Robust Positionin
Ahmadi, Elham, Olama, Alireza, Välisuo, Petri, Kuusniemi, Heidi
Reliable positioning in GNSS-challenged environments remains a critical challenge for navigation systems. Tightly coupled GNSS/IMU fusion improves robustness but remains vulnerable to non-Gaussian noise and outliers. We present a robust and adaptive factor graph-based fusion framework that directly integrates GNSS pseudorange measurements with IMU preintegration factors and incorporates the Barron loss, a general robust loss function that unifies several m-estimators through a single tunable parameter. By adaptively down weighting unreliable GNSS measurements, our approach improves resilience positioning. The method is implemented in an extended GTSAM framework and evaluated on the UrbanNav dataset. The proposed solution reduces positioning errors by up to 41% relative to standard FGO, and achieves even larger improvements over extended Kalman filter (EKF) baselines in urban canyon environments. These results highlight the benefits of Barron loss in enhancing the resilience of GNSS/IMU-based navigation in urban and signal-compromised environments.
Variation-Bounded Loss for Noise-Tolerant Learning
Wang, Jialiang, Zhou, Xiong, Liu, Xianming, Hu, Gangfeng, Zhai, Deming, Jiang, Junjun, Li, Haoliang
Mitigating the negative impact of noisy labels has been a perennial issue in supervised learning. Robust loss functions have emerged as a prevalent solution to this problem. In this work, we introduce the V ariation Ratio as a novel property related to the robustness of loss functions, and propose a new family of robust loss functions, termed V ariation-Bounded Loss (VBL), which is characterized by a bounded variation ratio. We provide theoretical analyses of the variation ratio, proving that a smaller variation ratio would lead to better robustness. Furthermore, we reveal that the variation ratio provides a feasible method to relax the symmetric condition and offers a more concise path to achieve the asymmetric condition. Based on the variation ratio, we reformulate several commonly used loss functions into a variation-bounded form for practical applications.
Combating Noisy Labels via Dynamic Connection Masking
Zhang, Xinlei, Liu, Fan, Zhang, Chuanyi, Cheng, Fan, Zheng, Yuhui
Noisy labels are inevitable in real-world scenarios. Due to the strong capacity of deep neural networks to memorize corrupted labels, these noisy labels can cause significant performance degradation. Existing research on mitigating the negative effects of noisy labels has mainly focused on robust loss functions and sample selection, with comparatively limited exploration of regularization in model architecture. Inspired by the sparsity regularization used in Kolmogorov-Arnold Networks (KANs), we propose a Dynamic Connection Masking (DCM) mechanism for both Multi-Layer Perceptron Networks (MLPs) and KANs to enhance the robustness of classifiers against noisy labels. The mechanism can adaptively mask less important edges during training by evaluating their information-carrying capacity. Through theoretical analysis, we demonstrate its efficiency in reducing gradient error. Our approach can be seamlessly integrated into various noise-robust training methods to build more robust deep networks, including robust loss functions, sample selection strategies, and regularization techniques. Extensive experiments on both synthetic and real-world benchmarks demonstrate that our method consistently outperforms state-of-the-art (SOTA) approaches. Furthermore, we are also the first to investigate KANs as classifiers against noisy labels, revealing their superior noise robustness over MLPs in real-world noisy scenarios. Our code will soon be publicly available.
Introducing Fractional Classification Loss for Robust Learning with Noisy Labels
Kurucu, Mert Can, Kumbasar, Tufan, Eksin, İbrahim, Güzelkaya, Müjde
Robust loss functions are crucial for training deep neural networks in the presence of label noise, yet existing approaches require extensive, dataset-specific hyperparameter tuning. In this work, we introduce Fractional Classification Loss (FCL), an adaptive robust loss that automatically calibrates its robustness to label noise during training. Built within the active-passive loss framework, FCL employs the fractional derivative of the Cross-Entropy (CE) loss as its active component and the Mean Absolute Error (MAE) as its passive loss component. With this formulation, we demonstrate that the fractional derivative order $μ$ spans a family of loss functions that interpolate between MAE-like robustness and CE-like fast convergence. Furthermore, we integrate $μ$ into the gradient-based optimization as a learnable parameter and automatically adjust it to optimize the trade-off between robustness and convergence speed. We reveal that FCL's unique property establishes a critical trade-off that enables the stable learning of $μ$: lower log penalties on difficult or mislabeled examples improve robustness but impose higher penalties on easy or clean data, reducing model confidence in them. Consequently, FCL can dynamically reshape its loss landscape to achieve effective classification performance under label noise. Extensive experiments on benchmark datasets show that FCL achieves state-of-the-art results without the need for manual hyperparameter tuning.
Joint Asymmetric Loss for Learning with Noisy Labels
Wang, Jialiang, Liu, Xianming, Zhou, Xiong, Hu, Gangfeng, Zhai, Deming, Jiang, Junjun, Ji, Xiangyang
Learning with noisy labels is a crucial task for training accurate deep neural networks. To mitigate label noise, prior studies have proposed various robust loss functions, particularly symmetric losses. Nevertheless, symmetric losses usually suffer from the underfitting issue due to the overly strict constraint. To address this problem, the Active Passive Loss (APL) jointly optimizes an active and a passive loss to mutually enhance the overall fitting ability. Within APL, symmetric losses have been successfully extended, yielding advanced robust loss functions. Despite these advancements, emerging theoretical analyses indicate that asymmetric losses, a new class of robust loss functions, possess superior properties compared to symmetric losses. However, existing asymmetric losses are not compatible with advanced optimization frameworks such as APL, limiting their potential and applicability. Motivated by this theoretical gap and the prospect of asymmetric losses, we extend the asymmetric loss to the more complex passive loss scenario and propose the Asymetric Mean Square Error (AMSE), a novel asymmetric loss. We rigorously establish the necessary and sufficient condition under which AMSE satisfies the asymmetric condition. By substituting the traditional symmetric passive loss in APL with our proposed AMSE, we introduce a novel robust loss framework termed Joint Asymmetric Loss (JAL). Extensive experiments demonstrate the effectiveness of our method in mitigating label noise. Code available at: https://github.com/cswjl/joint-asymmetric-loss
Heavy Lasso: sparse penalized regression under heavy-tailed noise via data-augmented soft-thresholding
High-dimensional linear regression is a fundamental tool in modern statistics, particularly when the number of predictors exceeds the sample size. The classical Lasso, which relies on the squared loss, performs well under Gaussian noise assumptions but often deteriorates in the presence of heavy-tailed errors or outliers commonly encountered in real data applications such as genomics, finance, and signal processing. To address these challenges, we propose a novel robust regression method, termed Heavy Lasso, which incorporates a loss function inspired by the Student's t-distribution within a Lasso penalization framework. This loss retains the desirable quadratic behavior for small residuals while adaptively downweighting large deviations, thus enhancing robustness to heavy-tailed noise and outliers. Heavy Lasso enjoys computationally efficient by leveraging a data augmentation scheme and a soft-thresholding algorithm, which integrate seamlessly with classical Lasso solvers. Theoretically, we establish non-asymptotic bounds under both $\ell_1$ and $\ell_2 $ norms, by employing the framework of localized convexity, showing that the Heavy Lasso estimator achieves rates comparable to those of the Huber loss. Extensive numerical studies demonstrate Heavy Lasso's superior performance over classical Lasso and other robust variants, highlighting its effectiveness in challenging noisy settings. Our method is implemented in the R package heavylasso available on Github.
ASRL:A robust loss function with potential for development
Hui, Chenyu, Zhang, Anran, Li, Xintong
Abstract--In this article, we proposed a partition-wise robust loss function (ASRL -Adapative segmented robust loss)based on the previous robust loss function. The characteristics of this loss function are that it achieves high robustness and a wide range of applicability through partition-wise design and adaptive parameter adjustment. Finally, the advantages and development potential of this loss function were verified by applying this loss function to the XGBoost and using five different datasets (with different dimensions, different sample numbers, and different fields) to compare with the XGBoost using other loss functions. The results of multiple experiments have proven the advantages of ASRL in MSE, MAE, R2, etc. ASRL's dynamic segmentation design and adaptive threshold make it more robust and can be applied to more fields, such as as a loss function for multimodal learning and reinforcement learning, and has a large room for development.The implementation code repository github link in this paper is:ASRLCODE Index Terms--ASRL,Robustness,MSE,MAE,Loss Function I. INTRODUCTION In regression prediction of machine learning, the loss function is the core tool to measure the difference between the model prediction value and the true value. Its role runs through the entire process of model training, optimization and evaluation.